ABSTRACT

This report provides some general guidelines and insights into the
design of an adaptive signal detection system based on quantized data.
In particular, we focus on adapting the parameters of the quantizer so
that the detector's probability of error, in the presence of noise of
unknown statistics, is minimized with respect to those parameters. The
unknown noise is assumed throughout this report to be independent,
additive noise with a probability density function that is symmetric
about zero. Under this assumption an adaptive scheme is developed, and
its performance and convergence are verified via simulation.
1. INTRODUCTION

The detection of signals in the presence of independent noise can be
accomplished very well if the noise characteristics are fully known; in
particular, if the noise probability density function is available, an
optimal detector can be formulated. However, it is often the case that
the physical mechanism of the noise is unknown or too complex to be
expressed in any simple way; moreover, the noise characteristics may not
be stationary in time or space and may otherwise be impossible to
represent by any fixed model.

Under these circumstances, a different approach to optimal detection is
necessary, one which borrows the idea of adaptation. The adaptation
process learns what the noise actually is at that moment and "adjusts"
the detector's structure so as to give near-optimal detection
performance.

The contents of this thesis provide some general guidelines and insights
into the design of an adaptive detector based on quantized data. In
particular, we focus on adapting the parameters of the quantizer so that
the detector's probability of error is minimized with respect to those
parameters. The unknown noise is assumed throughout this thesis to be
independent, additive noise with a probability density function that is
symmetric about zero.
2. SOME BASICS OF BINARY DETECTION WITH QUANTIZATION

2.1 Structure of an Optimal Detector Based on a Partitioned Sample Space

In binary state data communication, we transmit a positive-valued or
negative-valued signal depending on whether the state is "1" or "0".
After the signal has been transmitted through an additive-noise channel,
the receiver determines from the observation whether it contains the
positive-valued or the negative-valued signal. This problem can also be
viewed as a binary hypothesis testing problem in which the hypothesis
$H_0$ is tested versus the alternative hypothesis $H_1$; i.e.,

$$\begin{aligned} H_0 &: Y_i = -s + N_i \\ H_1 &: Y_i = s + N_i \end{aligned} \qquad i = 1, 2, 3, \ldots \tag{2.1}$$

where the positive and negative signals are of the same strength $s$ and
$\{N_i\}$ is a sequence of additive noise samples which are independent
and identically distributed (i.i.d.) random variables with common
probability density function $f_N(\cdot)$.

It is well known that the following detection scheme gives the best
receiver structure (in terms of the minimum probability of error) in
deciding $H_0$ vs. $H_1$:

$$\frac{p(\mathbf{y}|H_1)}{p(\mathbf{y}|H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \tau \tag{2.2}$$

where $p(\mathbf{y}|H_i)$, $i = 0,1$, is the probability density or mass
function of $\mathbf{Y}$ ($\mathbf{Y} = (Y_1, Y_2, Y_3, \ldots, Y_n)$)
given that $H_i$ is the true hypothesis, and $\tau$ is the threshold to
which the probability ratio is compared. The ratio
$p(\mathbf{y}|H_1)/p(\mathbf{y}|H_0)$ is often called the likelihood
ratio and is a function of the observation vector $\mathbf{y}$. If we
assume the two hypotheses $H_0$ and $H_1$ are equally likely to occur,
that is, they have equal a priori probability ($\Pr\{H_i\} = 1/2$,
$i = 0,1$), and that the penalties on incorrect decision under either
hypothesis are the same (no penalty on correct decision), then the best
$\tau$ is 1.

Since the logarithmic function is monotonically increasing with its
argument, we can write (2.2) in a different but equivalent way,

$$\log \frac{p(\mathbf{y}|H_1)}{p(\mathbf{y}|H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \log \tau \tag{2.3}$$

where the function $\log(p(\mathbf{y}|H_1)/p(\mathbf{y}|H_0))$ is often
called the log-likelihood ratio function of the observation $\mathbf{y}$.
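
As a brief worked illustration of (2.3), suppose (purely as an assumption for this example; the report itself leaves the noise density unspecified) that the noise in (2.1) is zero-mean Gaussian with variance $\sigma^2$. Because the samples are independent, the log-likelihood ratio is a sum of per-sample terms, and the quadratic terms cancel:

$$\log \frac{p(\mathbf{y}|H_1)}{p(\mathbf{y}|H_0)} = \sum_{i=1}^{n} \frac{(y_i + s)^2 - (y_i - s)^2}{2\sigma^2} = \frac{2s}{\sigma^2} \sum_{i=1}^{n} y_i$$

With $\tau = 1$ the threshold $\log \tau$ is zero, so in this special case the test simply decides $H_1$ when the sample sum is positive and $H_0$ when it is negative.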

In the problem to be considered here, we take a fixed finite number
($n$) of observations to determine whether $H_0$ or $H_1$ has occurred
and partition the observation space (the real line in this case) into a
finite number ($m$) of intervals. Then the detector structure of
Eq. (2.2) gives an optimal $m$-level quantizer-detector with $m$ preset
partition intervals.

Let $n_i$ be the number of samples from the observations
$\{y_j\}_{j=1}^{n}$ that fall in the $i$-th interval, and let $p_i^0$ and
$p_i^1$ be the probabilities that the value of the observation belongs
to the $i$-th interval under hypotheses $H_0$ and $H_1$, respectively.
Hence $\sum_{i=1}^{m} n_i = n$, the total number of observations, and
$\sum_{i=1}^{m} p_i^0 = 1$ and $\sum_{i=1}^{m} p_i^1 = 1$.

With the partitioning of the observation space, the probability
distributions of $\mathbf{n} = (n_1, n_2, n_3, \ldots, n_m)$ given
hypotheses $H_0$ and $H_1$ are multinomial, i.e.,

$$\begin{aligned} p(\mathbf{n}|H_0) &= \frac{n!}{n_1!\, n_2!\, n_3! \cdots n_m!} (p_1^0)^{n_1} (p_2^0)^{n_2} (p_3^0)^{n_3} \cdots (p_m^0)^{n_m} \\ p(\mathbf{n}|H_1) &= \frac{n!}{n_1!\, n_2!\, n_3! \cdots n_m!} (p_1^1)^{n_1} (p_2^1)^{n_2} (p_3^1)^{n_3} \cdots (p_m^1)^{n_m} \end{aligned} \tag{2.4}$$

Now Eq. (2.2) can be written as

$$\left(\frac{p_1^1}{p_1^0}\right)^{n_1} \left(\frac{p_2^1}{p_2^0}\right)^{n_2} \left(\frac{p_3^1}{p_3^0}\right)^{n_3} \cdots \left(\frac{p_m^1}{p_m^0}\right)^{n_m} \underset{H_0}{\overset{H_1}{\gtrless}} 1 \tag{2.5}$$

Note that $\tau$ has been taken as 1 for the remainder of this thesis.
Again, since the logarithmic function is nondecreasing, (2.5) has the
following equivalent form,

$$\sum_{i=1}^{m} n_i \log\left(\frac{p_i^1}{p_i^0}\right) \underset{H_0}{\overset{H_1}{\gtrless}} 0 \tag{2.6}$$

We arbitrarily assign the decision to $H_1$ when the test statistic
$\sum_{i=1}^{m} n_i \log(p_i^1/p_i^0)$ equals zero; this will not affect
the error-probability performance of the detector with equally likely
$H_0$ and $H_1$.
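
The test in (2.6) can be sketched in a few lines of Python. This is only an illustrative sketch: the Gaussian noise model, the function names (`gaussian_cdf`, `interval_probs`, `interval_counts`, `decide`) and the convention that a sample equal to a breakpoint goes to the interval on its right are assumptions made here, not part of the original text.

```python
import math
from bisect import bisect_right

def gaussian_cdf(x, sigma=1.0):
    """Zero-mean Gaussian CDF (assumed noise model for this sketch)."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def interval_probs(breakpoints, shift, cdf=gaussian_cdf):
    """Probability that shift + N falls in each of the m intervals."""
    edges = [-math.inf] + list(breakpoints) + [math.inf]
    return [cdf(hi - shift) - cdf(lo - shift)
            for lo, hi in zip(edges, edges[1:])]

def interval_counts(samples, breakpoints):
    """n_k: number of samples in each interval (left-closed intervals)."""
    counts = [0] * (len(breakpoints) + 1)
    for y in samples:
        counts[bisect_right(breakpoints, y)] += 1
    return counts

def decide(samples, breakpoints, s, cdf=gaussian_cdf):
    """Return (decision, statistic) for the test (2.6); ties go to H1."""
    p0 = interval_probs(breakpoints, -s, cdf)   # H0: Y = -s + N
    p1 = interval_probs(breakpoints, +s, cdf)   # H1: Y = +s + N
    n = interval_counts(samples, breakpoints)
    statistic = sum(nk * math.log(a / b) for nk, a, b in zip(n, p1, p0))
    return (1 if statistic >= 0 else 0), statistic
```

For example, `decide([0.9, 1.4, -0.2, 0.6, 2.1], [-1.0, 0.0, 1.0], 0.5)` counts the samples per interval and returns the decision together with the value of the statistic.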
2.2 The Optimal Quantizer-Detector

An $m$-level quantizer $Q(\cdot)$ is characterized by two sets of
numbers, $\mathbf{q} = (q_1, q_2, q_3, \ldots, q_m)$ (the quantization
levels) and $\mathbf{t}$, with
$-\infty = t_0 < t_1 < \ldots < t_{m-1} < t_m = \infty$ (the
breakpoints), which partition the real line (i.e., the observation
space) into $m$ intervals. The function of the quantizer is to set
$Q(y) = q_k$ if the observation $y$ is such that $t_{k-1} \le y < t_k$,
as shown in Fig. 1.

Since $\{N_i\}_{i=1}^{n}$ are i.i.d. random variables and
$Y_i = \pm s + N_i$, $i = 1, 2, 3, \ldots, n$,

$$\log \frac{p(\mathbf{y}|H_1)}{p(\mathbf{y}|H_0)} = \sum_{i=1}^{n} \log \frac{p(y_i|H_1)}{p(y_i|H_0)} \tag{2.7}$$

Thus equation (2.3) can be written as

$$\sum_{i=1}^{n} \log \frac{p(y_i|H_1)}{p(y_i|H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} 0 \tag{2.8}$$

Here again we have assigned the decision to $H_1$ when equality holds in
the comparison. If the log-likelihood ratio function of the observation
in (2.8) is quantized, we obtain a quantized version of the detector;
i.e.,

$$\sum_{i=1}^{n} Q\left(\log \frac{p(y_i|H_1)}{p(y_i|H_0)}\right) \underset{H_0}{\overset{H_1}{\gtrless}} 0 \tag{2.9}$$
This can be rewritten as follows, by the property of the quantizer
$Q(\cdot)$:

$$\sum_{k=1}^{m} q_k n_k \underset{H_0}{\overset{H_1}{\gtrless}} 0 \tag{2.10}$$

We have now obtained the structural form of the general quantizer
detector for this problem.

Since (2.6) gives the best detector with predetermined partitions of
the observation space, it can be easily recognized that if we set the
$q_k$ in (2.10) to be $\log(p_k^1/p_k^0)$, $k = 1, 2, 3, \ldots, m$, the
resulting $m$-level quantizer, $Q^o(\cdot)$, with the same prechosen
breakpoints, is optimal in detection performance.

The following result shows the equivalence of the test in (2.10) with
$q_k = q_k^o = \log(p_k^1/p_k^0)$ and the test in (2.3) with $\tau = 1$
as $m$, the number of quantization levels, goes to infinity.

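Property: Given the following conditions on a binary hypothesis testing
problem,

(1) under either $H_0$ or $H_1$, $\{Y_i\}_{i=1}^{n}$ are i.i.d. random
variables;

(2) the cumulative distribution functions of $Y_i$ under both $H_0$ and
$H_1$ ($F_Y(y|H_0)$ or $F_Y(y|H_1)$, respectively) are continuously
differentiable and strictly increasing,

then

$$\lim_{m \to \infty} \sum_{k=1}^{m} n_k \log\left(\frac{p_k^1}{p_k^0}\right) = \log\left(\frac{p(\mathbf{Y}|H_1)}{p(\mathbf{Y}|H_0)}\right) \tag{2.11}$$

where $\mathbf{Y} = (Y_1, Y_2, Y_3, \ldots, Y_n)$, $p_k^i$ is the
probability that $Y$ is in the $k$-th interval given $H_i$, $i = 0,1$,
and $n_k$ is the number of samples from $\{y_i\}_{i=1}^{n}$ that fall in
the $k$-th interval.

To see this, note that

$$\sum_{k=1}^{m} n_k \log\left(\frac{p_k^1}{p_k^0}\right) = \sum_{i=1}^{n} \log\left(\frac{p_k^1(i)}{p_k^0(i)}\right)$$

where $p_k^j(i)$ is the probability that $y_i$ belongs to the $k$-th
interval under $H_j$. Since $p_k^j(i) > 0$ and the logarithmic function
is continuous,

$$\lim_{m \to \infty} \sum_{k=1}^{m} n_k \log\left(\frac{p_k^1}{p_k^0}\right) = \sum_{i=1}^{n} \log\left(\lim_{m \to \infty} \frac{p_k^1(i)}{p_k^0(i)}\right)$$

By L'Hopital's rule,

$$\lim_{m \to \infty} \frac{p_k^1(i)}{p_k^0(i)} = \frac{p(y_i|H_1)}{p(y_i|H_0)}$$

Hence

$$\sum_{i=1}^{n} \log\left(\lim_{m \to \infty} \frac{p_k^1(i)}{p_k^0(i)}\right) = \sum_{i=1}^{n} \log\left(\frac{p(y_i|H_1)}{p(y_i|H_0)}\right)$$

and therefore

$$\lim_{m \to \infty} \sum_{k=1}^{m} n_k \log\left(\frac{p_k^1}{p_k^0}\right) = \log \prod_{i=1}^{n} \frac{p(y_i|H_1)}{p(y_i|H_0)} = \log\left(\frac{p(\mathbf{y}|H_1)}{p(\mathbf{y}|H_0)}\right)$$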
2.3 The Locally Optimum Quantizer-Detector

In the detection of weak signals, the maximum power slope at $s = 0$ is
an appropriate criterion. It can be shown that the locally optimal test
statistic $T_{lo}(\mathbf{y})$ for the binary hypothesis testing problem
of (2.1) can be obtained by differentiating the log-likelihood function
in (2.3) with respect to $s$ and setting $s = 0$,

$$T_{lo}(\mathbf{y}) = \sum_{i=1}^{n} g_{lo}(y_i) \tag{2.16}$$

where $g_{lo}(y) = -f_N'(y)/f_N(y)$. The generalized Neyman-Pearson
lemma asserts that a test with $T_{lo}(\cdot)$ as its test statistic
maximizes the power slope at $s = 0$ over all tests. The corresponding
locally optimal quantized test statistic with a given set of breakpoints
$\mathbf{t}$ is (see Kassam [2])

$$T_Q^{lo}(\mathbf{y}) = \sum_{i=1}^{n} Q_{lo}(y_i) \tag{2.17}$$

where $Q_{lo}(y)$ is the optimal quantized version of $g_{lo}(y)$ in the
minimum-mean-squared-error sense (under $H_0$) with breakpoints
$\mathbf{t}$, and it can be shown to be

$$Q_{lo}(y) = q_k^{lo} = \frac{f_N(t_{k-1}) - f_N(t_k)}{F_N(t_k) - F_N(t_{k-1})}, \qquad t_{k-1} \le y < t_k \tag{2.18}$$

where $F_N(\cdot)$ is the noise cumulative distribution function. As
before, (2.17) can be rewritten to give

$$T_Q^{lo}(\mathbf{y}) = \sum_{k=1}^{m} q_k^{lo} n_k \tag{2.19}$$
Similar results can be obtained by differentiating the optimal quantized
test statistic of (2.10) with respect to $s$ and setting $s = 0$; up to
a positive constant factor, the resulting statistic

$$\sum_{k=1}^{m} n_k \left(\frac{f_N(t_k) - f_N(t_{k-1})}{F_N(t_{k-1}) - F_N(t_k)}\right) \tag{2.20}$$

is also the locally optimal quantized test statistic with the same
breakpoints $\mathbf{t}$. Again the generalized Neyman-Pearson lemma
asserts that a quantizer detector with (2.20) as its test statistic
maximizes the power slope at $s = 0$ over all quantizer detectors of the
form in (2.10).

It is known [2] that the optimal breakpoints $\mathbf{t}$ in the
weak-signal case can be obtained by solving the following two sets of
equations,

$$q_k^{lo} = \frac{f_N(t_k) - f_N(t_{k-1})}{F_N(t_{k-1}) - F_N(t_k)}, \qquad k = 1, 2, 3, \ldots, m \tag{2.21}$$

and

$$\frac{q_k^{lo} + q_{k+1}^{lo}}{2} = g_{lo}(t_k), \qquad k = 1, 2, 3, \ldots, m-1 \tag{2.22}$$

where $g_{lo}(t_k) = -f_N'(t_k)/f_N(t_k)$.
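
One way to solve (2.21) and (2.22) numerically is to alternate between them until the breakpoints stop changing. The sketch below is an illustration under stated assumptions, not the procedure of the report: it assumes unit-variance Gaussian noise, for which $g_{lo}(t) = t$, so that (2.22) gives each breakpoint directly as the average of its two neighbouring levels. The function names and the fixed-point iteration itself are choices made here.

```python
import math

def pdf(x):
    """Unit-variance Gaussian density (assumed noise model)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Unit-variance Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lo_levels(t):
    """Levels of (2.21) for the finite breakpoints t (t_0, t_m infinite)."""
    edges = [-math.inf] + list(t) + [math.inf]
    return [(pdf(lo) - pdf(hi)) / (cdf(hi) - cdf(lo))
            for lo, hi in zip(edges, edges[1:])]

def lo_breakpoints(m, iterations=500):
    """Alternate (2.21) and (2.22); valid as written only for g_lo(t) = t."""
    t = [-2.0 + 4.0 * k / m for k in range(1, m)]   # evenly spaced start
    for _ in range(iterations):
        q = lo_levels(t)
        t = [0.5 * (q[k] + q[k + 1]) for k in range(m - 1)]
    return t, lo_levels(t)
```

For other noise densities the update of the breakpoints would require inverting $g_{lo}$, which may call for a numerical root finder instead of the closed-form averaging used here.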
2.4 Some Properties on the Asymptotically and Locally Optimal
Quantizer-Detector

Finally we want to show that, as the signal strength $s \to 0$ and the
sample size $n \to \infty$ simultaneously, the set of breakpoints
$\mathbf{t}$ which minimizes an upper bound to the probability of error
in detection using the locally optimal quantization levels of (2.18)
approaches the set of locally optimal breakpoints given by (2.21) and
(2.22). But first we need two properties: one gives an upper bound to
the probability of error, and the other gives the symmetry property of
the optimal breakpoints minimizing the probability of error in the
detection problem depicted by (2.1), with equal a priori probabilities
for $H_0$ and $H_1$ and a symmetric noise density function.

Since $\Pr\{H_0\} = \Pr\{H_1\} = 1/2$, the probability of error in
detection using (2.6) is

$$P_e = \frac{1}{2} \sum_{\mathbf{n} \in N_1} \Pr\{\mathbf{n}|H_0\} + \frac{1}{2} \sum_{\mathbf{n} \in N_2} \Pr\{\mathbf{n}|H_1\} \tag{2.23}$$

where

$$\Pr\{\mathbf{n}|H_i\} = \frac{n!}{n_1!\, n_2!\, n_3! \cdots n_m!} (p_1^i)^{n_1} (p_2^i)^{n_2} (p_3^i)^{n_3} \cdots (p_m^i)^{n_m}$$

$$N_1 = \left\{\mathbf{n} = (n_1, n_2, n_3, \ldots, n_m) \ \text{s.t.} \ \sum_{k=1}^{m} n_k \log\left(\frac{p_k^1}{p_k^0}\right) \ge 0\right\}$$

and

$$N_2 = \left\{\mathbf{n} = (n_1, n_2, n_3, \ldots, n_m) \ \text{s.t.} \ \sum_{k=1}^{m} n_k \log\left(\frac{p_k^1}{p_k^0}\right) < 0\right\}$$

Notice that the $p_k^i$'s are functions of the breakpoints; thus, under
smoothness conditions, setting the partial derivative of $P_e$ with
respect to the breakpoints to zero yields a necessary condition on the
optimal breakpoints [1],

$$\frac{\partial P_e}{\partial t_j} = \frac{1}{2} \sum_{\mathbf{n} \in N_1} \Pr\{\mathbf{n}|H_0\} \left[\frac{n_j}{p_j^0} - \frac{n_{j+1}}{p_{j+1}^0}\right] f_N(t_j + s) + \frac{1}{2} \sum_{\mathbf{n} \in N_2} \Pr\{\mathbf{n}|H_1\} \left[\frac{n_j}{p_j^1} - \frac{n_{j+1}}{p_{j+1}^1}\right] f_N(t_j - s) = 0 \tag{2.24}$$

In general it is necessary to use a gradient search technique to solve
for the optimal breakpoints with (2.24), since it is not likely that a
closed form solution to this equation can be found.

The following property establishes the symmetry of $\mathbf{t}$; thus
half of the breakpoints can be determined from their symmetric
counterparts.

Property: Given the following conditions

(1) The observations $(Y_1, Y_2, Y_3, \ldots, Y_n)$ are i.i.d. random
variables.

(2) $\Pr\{H_0\} = \Pr\{H_1\} = 1/2$; i.e., $H_1$ and $H_0$ are equally
likely to occur.

(3) The cumulative distribution functions of the observation $y$ under
$H_0$ and $H_1$, $F_Y(y|H_0)$ and $F_Y(y|H_1)$, are continuous and
symmetrical in the sense that $F_Y(-y|H_0) = 1 - F_Y(y|H_1)$.

(4) $\mathbf{t} = (t_1, t_2, t_3, \ldots, t_{m-1})$ are symmetric
breakpoints in the sense that $t_{m-j} = -t_j$, $j = 1, 2, 3, \ldots, m-1$.

(5) The $j$-th component of $\mathbf{t}$ is optimal, so that with (2.24)
$\partial P_e / \partial t_j = 0$.

Then the $(m-j)$-th component of $\mathbf{t}$ is also optimal; that is,
$\partial P_e / \partial t_{m-j} = 0$.

Proof: Given that $F_Y(y|H_i)$, $i = 0,1$, and $\mathbf{t}$ are
symmetrical, the following four statements are true,

(a) $f_Y(t_j|H_0) = f_Y(t_{m-j}|H_1)$

Q.E.D.


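From this property we can see that the optimal quantizer is odd
symmetric,

$$q_{-k}^o = \log\left(\frac{p_{-k}^1}{p_{-k}^0}\right) = \log\left(\frac{p_k^0}{p_k^1}\right) = -q_k^o \tag{2.25}$$

where the index $-k$ denotes the interval that mirrors the $k$-th
interval about zero.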
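
The odd symmetry in (2.25) is easy to check numerically. The following sketch is illustrative only: it assumes standard Cauchy noise, and the helper names (`cauchy_cdf`, `optimal_levels`) are hypothetical. It computes $q_k = \log(p_k^1/p_k^0)$ for symmetric breakpoints and compares each level with its mirror image.

```python
import math

def cauchy_cdf(x):
    """Standard Cauchy CDF (assumed symmetric noise for this sketch)."""
    return 0.5 + math.atan(x) / math.pi

def optimal_levels(breakpoints, s, cdf=cauchy_cdf):
    """q_k = log(p_k^1 / p_k^0) for the intervals cut at the breakpoints."""
    edges = [-math.inf] + list(breakpoints) + [math.inf]
    levels = []
    for lo, hi in zip(edges, edges[1:]):
        p0 = cdf(hi + s) - cdf(lo + s)   # H0: Y = -s + N
        p1 = cdf(hi - s) - cdf(lo - s)   # H1: Y = +s + N
        levels.append(math.log(p1 / p0))
    return levels

def is_odd_symmetric(levels, tol=1e-9):
    """True if each level is the negative of its mirror-image level."""
    m = len(levels)
    return all(abs(levels[k] + levels[m - 1 - k]) < tol for k in range(m))
```

With symmetric breakpoints such as `[-0.4, 0.0, 0.4]`, `is_odd_symmetric(optimal_levels([-0.4, 0.0, 0.4], 0.75))` is expected to hold; with asymmetric breakpoints it generally does not.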


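The next result gives the upper bound to the probability of error in
detection of (2.6).

Property: Given the test statistic
$T = \sum_{k=1}^{m} n_k \log(p_k^1/p_k^0)$, with $E\{T|H_0\} < 0$ and
$E\{T|H_1\} \ge 0$, and $\Pr\{H_0\} = \Pr\{H_1\} = 1/2$, the probability
of error of the test (2.6) is upper bounded by

$$P_e \le \frac{1}{2} \inf_{\omega} \left[\sum_{k=1}^{m} \left(\frac{p_k^1}{p_k^0}\right)^{\omega} p_k^0\right]^n + \frac{1}{2} \inf_{\omega} \left[\sum_{k=1}^{m} \left(\frac{p_k^1}{p_k^0}\right)^{\omega} p_k^1\right]^n \tag{2.26}$$

where $n$ is the sample size and $\omega$ is real valued.

Proof: If $E\{T|H_0\} < 0$ and $E\{T|H_1\} \ge 0$, then by applying the
Chernoff bound to $P_e$,

$$P_e \le \frac{1}{2} \inf_{\omega} [G_T(\omega|H_0)] + \frac{1}{2} \inf_{\omega} [G_T(\omega|H_1)] \tag{2.27}$$

where

$$P_e = \frac{1}{2} \sum_{\mathbf{n} \in N_1} \Pr\{\mathbf{n}|H_0\} + \frac{1}{2} \sum_{\mathbf{n} \in N_2} \Pr\{\mathbf{n}|H_1\}$$

$N_1 = \{\mathbf{n} \ \text{s.t.} \ T \ge 0\}$ and
$N_2 = \{\mathbf{n} \ \text{s.t.} \ T < 0\}$, and $G_T(\omega|H_j)$ is
the moment generating function of the random variable $T$ given
hypothesis $H_j$, $j = 0,1$, which is given as

$$G_T(\omega|H_j) = \left[\sum_{k=1}^{m} \left(\frac{p_k^1}{p_k^0}\right)^{\omega} p_k^j\right]^n, \qquad j = 0,1 \tag{2.28}$$

Substituting (2.28) into (2.27) gives (2.26).

Q.E.D.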


By the property on symmetry of $\mathbf{t}$ and the fact that
$q_{-k}^o = -q_k^o$, we can consider only the positive half of the
entire partition and rewrite (2.26) as

(2.29)

where $m' = m/2$.

The minimizing values of $\omega$ can be found for the first and second
terms in (2.29) by differentiating them with respect to $\omega$ and
setting the derivatives to zero separately, for $k = 1, 2, 3, \ldots, m$.
(2.30) and (2.31) give $\omega = 1/2$ and $\omega = -1/2$ as the
minimizing values for the first and second terms in (2.29),
respectively; thus

(2.32)

The quantity on the right is Hellinger's integral for the partitioned
data sequence. The design of quantizers in terms of this latter quantity
has been considered by Poor and Thomas in [6].
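
To make the role of the minimizing $\omega$ concrete: substituting $\omega = 1/2$ into the first term of (2.26) and $\omega = -1/2$ into the second turns each bracket into $\rho = \sum_k \sqrt{p_k^0 p_k^1}$, so both Chernoff terms become $\rho^n$ and their half-weighted sum is $\rho^n$. The sketch below evaluates that quantity; it assumes unit-variance Gaussian noise, and the function names are hypothetical.

```python
import math

def gaussian_cdf(x):
    """Unit-variance Gaussian CDF (assumed noise model for this sketch)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def interval_probs(breakpoints, shift, cdf=gaussian_cdf):
    """Probability that shift + N falls in each interval."""
    edges = [-math.inf] + list(breakpoints) + [math.inf]
    return [cdf(hi - shift) - cdf(lo - shift)
            for lo, hi in zip(edges, edges[1:])]

def hellinger_bound(breakpoints, s, n, cdf=gaussian_cdf):
    """rho**n with rho = sum_k sqrt(p_k^0 * p_k^1)."""
    p0 = interval_probs(breakpoints, -s, cdf)   # H0: Y = -s + N
    p1 = interval_probs(breakpoints, +s, cdf)   # H1: Y = +s + N
    rho = sum(math.sqrt(a * b) for a, b in zip(p0, p1))
    return rho ** n
```

Because $\rho < 1$ whenever the two interval distributions differ, the bound decays geometrically in the sample size $n$; when $s = 0$ the two distributions coincide and the bound is 1.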

To study the performance of a fixed sample size detector for weak
signal, or equivalently for large sample size $n$, it is usual to assume
that the signal strength $s$ is of the order of $1/\sqrt{n}$, since $n$
is a parameter under control. So let $s = K/\sqrt{n}$, where $K$ is any
positive constant, so that both $s \to 0$ and $n \to \infty$ at the same
time. However, with $s = K/\sqrt{n}$, the bound in (2.32) approaches
$1^\infty$, an indeterminate form, as $n \to \infty$. Applying
L'Hopital's rule twice to the bound in (2.32), we can obtain an
"asymptotic" upper bound to $P_e$ as $s \to 0$ and $n \to \infty$,
which is

$$\lim_{\substack{s \to 0 \\ n \to \infty}} P_e \le \exp\left[-\frac{K^2}{2} \sum_{k=1}^{m} \frac{\left(f_N(t_k) - f_N(t_{k-1})\right)^2}{F_N(t_k) - F_N(t_{k-1})}\right] \tag{2.33}$$

A detailed derivation of (2.33) is in the Appendix.

We now try to obtain a set of breakpoints which will give the smallest
possible upper bound to $P_e$ in the case of weak signal and large
sample size by taking the derivative of the bound in (2.33) with respect
to $t_k$. It turns out that $t_k$ has to satisfy the equation

$$\frac{\partial}{\partial t_k} \left[\frac{\left(f_N(t_k) - f_N(t_{k-1})\right)^2}{F_N(t_k) - F_N(t_{k-1})} + \frac{\left(f_N(t_{k+1}) - f_N(t_k)\right)^2}{F_N(t_{k+1}) - F_N(t_k)}\right] = 0$$

This becomes

$$\frac{2\left(f_N(t_k) - f_N(t_{k-1})\right) f_N'(t_k)}{F_N(t_k) - F_N(t_{k-1})} - \left(\frac{f_N(t_k) - f_N(t_{k-1})}{F_N(t_k) - F_N(t_{k-1})}\right)^2 f_N(t_k) - \frac{2\left(f_N(t_{k+1}) - f_N(t_k)\right) f_N'(t_k)}{F_N(t_{k+1}) - F_N(t_k)} + \left(\frac{f_N(t_{k+1}) - f_N(t_k)}{F_N(t_{k+1}) - F_N(t_k)}\right)^2 f_N(t_k) = 0 \tag{2.34}$$

From (2.34) and with

$$q_k = \frac{f_N(t_{k-1}) - f_N(t_k)}{F_N(t_k) - F_N(t_{k-1})}, \qquad k = 1, 2, 3, \ldots, m-1 \tag{2.35}$$

the breakpoints $\mathbf{t}$ can be determined by the following set of
equations

$$\frac{q_k + q_{k+1}}{2} = -\frac{f_N'(t_k)}{f_N(t_k)}, \qquad k = 1, 2, 3, \ldots, m-1 \tag{2.36}$$


Surprisingly, this is the same set of equations, (2.21) and (2.22),
necessary for $\mathbf{t}$ to be locally optimal. Hence we can conclude
that, with respect to the breakpoints $\mathbf{t}$, the upper bound to
the probability of error in detection of an optimal quantized test
given in the above theorem is minimized, as $s \to 0$ and
$n \to \infty$, simultaneously with the inverse of the test's power
slope at $s = 0$, which is, as noted earlier, an appropriate criterion
for detection of small signals in place of the test's power.

This result is a special case of that obtained by Poor and Thomas in [6].


3.  THE  PROBABILITY  OF  ERROR  OF  THE  4-LEVEL  QUANTIZER-DETECTOR 


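3.1 Structure of the 4-Level Quantizer Detector

From earlier discussion, we conclude that an optimal $m$-level quantizer
detector for arbitrary signal strength $s$ is given by

$$\sum_{k=1}^{m} n_k \log\left(\frac{p_k^1}{p_k^0}\right) \underset{H_0}{\overset{H_1}{\gtrless}} 0 \tag{3.1}$$

where $p_k^i = [F_N(t_k - (-1)^{i+1} s) - F_N(t_{k-1} - (-1)^{i+1} s)]$
for $i = 0,1$, and $t_k$, $k = 1, 2, 3, \ldots, m-1$, have to satisfy
the set of equations

$$\frac{\partial P_e}{\partial t_k} = \frac{1}{2} \sum_{\mathbf{n} \in N_1} \Pr\{\mathbf{n}|H_0\} \left[\frac{n_k}{p_k^0} - \frac{n_{k+1}}{p_{k+1}^0}\right] f_N(t_k + s) + \frac{1}{2} \sum_{\mathbf{n} \in N_2} \Pr\{\mathbf{n}|H_1\} \left[\frac{n_k}{p_k^1} - \frac{n_{k+1}}{p_{k+1}^1}\right] f_N(t_k - s) = 0 \tag{3.2}$$

where

$$N_1 = \left\{\mathbf{n} = (n_1, n_2, n_3, \ldots, n_m) \ \text{s.t.} \ \sum_{k=1}^{m} n_k \log\left(\frac{p_k^1}{p_k^0}\right) \ge 0\right\}$$

and

$$N_2 = \left\{\mathbf{n} = (n_1, n_2, n_3, \ldots, n_m) \ \text{s.t.} \ \sum_{k=1}^{m} n_k \log\left(\frac{p_k^1}{p_k^0}\right) < 0\right\}$$

But (3.2), as mentioned before, generally does not have a closed form
solution for $t_k$, and it can only be solved by some root-searching
technique. However, from the property on symmetry of $\mathbf{t}$, we
can easily see that, for $m$ even, $t_{m/2} = 0$ and $t_j = -t_{m-j}$.
So there are only $(m/2 - 1)$ different $t_k$'s left to determine, as
the others can be set according to the property.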
Only when $m = 4$ are the root searching techniques one-dimensional, and
we will confine ourselves to this case for the rest of this thesis,
although it is conceptually as simple to use some higher-dimensional
searching techniques to locate all the $(m/2 - 1)$ $t_k$'s in (3.2) for
$m \ge 6$ (note: $m$ is taken to be even).

Once $\mathbf{t}$ is set from some searching method, its corresponding
optimal quantization levels $\mathbf{q}$ and, consequently, the optimal
quantized test statistic are determined. With symmetric noise density
function $f_N(\cdot)$, the odd symmetric property of $\mathbf{q}$
implies that (3.1) can be rewritten as

$$\sum_{k=1}^{m/2} (n_{m+1-k} - n_k)\, q_{m+1-k} \underset{H_0}{\overset{H_1}{\gtrless}} 0 \tag{3.3}$$

With $m = 4$, (3.3) becomes

$$\sum_{k=1}^{2} (n_{5-k} - n_k)\, q_{5-k} \underset{H_0}{\overset{H_1}{\gtrless}} 0 \tag{3.4}$$

We can see that normalizing the test statistic in (3.4) with $q_3$ has
no effect on the quantizer detector performance, hence

$$(n_4 - n_1)\, q_r + (n_3 - n_2) \underset{H_0}{\overset{H_1}{\gtrless}} 0 \tag{3.5}$$

where $q_r = q_4/q_3$. Also with $m = 4$, $t_1 = -t_3$ and $t_2 = 0$,
(3.2) can be written as
$$\frac{1}{2} \sum_{\mathbf{n} \in N_1} \Pr\{\mathbf{n}|H_0\}\, [\,\cdots\,]\, f_N(t_3 + s) + \frac{1}{2} \sum_{\mathbf{n} \in N_2} \Pr\{\mathbf{n}|H_1\}\, [\,\cdots\,]\, f_N(t_3 - s) = 0 \tag{3.6}$$

with
$N_1 = \{\mathbf{n} = (n_1, n_2, n_3, n_4) \ \text{s.t.} \ (n_4 - n_1) q_r + (n_3 - n_2) \ge 0\}$
and
$N_2 = \{\mathbf{n} = (n_1, n_2, n_3, n_4) \ \text{s.t.} \ (n_4 - n_1) q_r + (n_3 - n_2) < 0\}$,
and

$p_1^i = F_N(-t_3 \pm s)$, $p_2^i = F_N(\pm s) - F_N(-t_3 \pm s)$,
$p_3^i = F_N(t_3 \pm s) - F_N(\pm s)$, $p_4^i = 1 - F_N(t_3 \pm s)$,
with the upper sign for $i = 0$ and the lower sign for $i = 1$.

(3.6) is a function of $t_3$ only, and many one-dimensional root-seeking
methods can give a solution for $\mathbf{t}$.

For the reason given in the following chapter, we prefer and will use,
instead of (3.6), the probability of error itself in solving for $t_3$,
i.e.,

$$P_e(t_3) = \frac{1}{2} \sum_{\mathbf{n} \in N_1} \Pr\{\mathbf{n}|H_0\} + \frac{1}{2} \sum_{\mathbf{n} \in N_2} \Pr\{\mathbf{n}|H_1\} \tag{3.7}$$

Again, using any one-dimensional "peak" seeking method on (3.7), $t_3$
can be located as well.
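
The evaluation of (3.7) for the 4-level case can be sketched by enumerating every count vector $(n_1, n_2, n_3, n_4)$ and adding its multinomial probability under the hypothesis for which the decision rule $(n_4 - n_1) q_r + (n_3 - n_2) \ge 0$ is in error. The code below is a sketch under assumptions: unit-variance Gaussian noise, a plain grid search in place of whichever one-dimensional search the report actually uses, and hypothetical function names.

```python
import math

def gaussian_cdf(x):
    """Unit-variance Gaussian CDF (assumed noise model for this sketch)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cell_probs(t3, shift, cdf):
    """Probabilities of the four cells cut at -t3, 0, t3 when Y = shift + N."""
    F = [0.0, cdf(-t3 - shift), cdf(-shift), cdf(t3 - shift), 1.0]
    return [F[k + 1] - F[k] for k in range(4)]

def multinomial(counts, probs):
    """Multinomial probability of the count vector."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)
    value = float(coef)
    for c, p in zip(counts, probs):
        value *= p ** c
    return value

def error_probability(t3, s, n, q_r, cdf=gaussian_cdf):
    """P_e(t3) of (3.7) with the level ratio q_r held fixed."""
    p0 = cell_probs(t3, -s, cdf)   # H0: Y = -s + N
    p1 = cell_probs(t3, +s, cdf)   # H1: Y = +s + N
    pe = 0.0
    for n1 in range(n + 1):
        for n2 in range(n + 1 - n1):
            for n3 in range(n + 1 - n1 - n2):
                c = (n1, n2, n3, n - n1 - n2 - n3)
                if (c[3] - c[0]) * q_r + (c[2] - c[1]) >= 0:
                    pe += 0.5 * multinomial(c, p0)   # in N1: error under H0
                else:
                    pe += 0.5 * multinomial(c, p1)   # in N2: error under H1
    return pe

def best_t3(s, n, q_r, grid, cdf=gaussian_cdf):
    """Grid point with the smallest P_e (a stand-in for a 1-D search)."""
    return min(grid, key=lambda t3: error_probability(t3, s, n, q_r, cdf))
```

For $n = 10$ there are only 286 count vectors, so a call such as `best_t3(0.75, 10, 2.0, [0.1 * k for k in range(1, 21)])` is cheap; the cost grows roughly as $n^3$ for larger sample sizes.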

3.2 Formation of the $P_e^{so}(t_3)$ and $P_e^{o}(t_3)$ Curves

$P_e(t_3)$ can be plotted versus $t_3$ in two different ways.
$P_e^{so}$ is the curve plotted with $q_r$, which determines $N_1$ and
$N_2$ in (3.7), held fixed; this implies the same $N_1$ and $N_2$ are
used in calculating $P_e(t_3)$ for all $t_3$. Minimizing $P_e^{so}$ with
respect to $t_3$ corresponds to the minimization of the probability of
error by moving $t_3$ around until the minimum probability of error is
attained, with the quantization level ratio fixed throughout the
process. Obviously, the $t_3$ so obtained is optimal only for the class
of quantizer detectors using that particular level ratio.

Several $P_e^{so}(t_3)$ curves are shown in Fig. 2 for different values
of $q_r$ (with Gaussian and Cauchy noise). It can be seen that for any
$t_3$ there is a corresponding $q_r$ which gives the minimum value of
the probability of error at that particular $t_3$. Since
$q_k = \log(p_k^1/p_k^0)$ gives an optimal quantizer with breakpoints
$\mathbf{t}$, the optimal $q_r$ corresponding to each $t_3$ is

$$q_r = \log\left(\frac{1 - F_N(t_3 - s)}{1 - F_N(t_3 + s)}\right) \Big/ \log\left(\frac{F_N(t_3 - s) - F_N(-s)}{F_N(t_3 + s) - F_N(s)}\right) \tag{3.8}$$

$P_e^{o}(t_3)$ is then the curve which picks off the minimum of all the
probabilities of error over all possible $q_r$ at each $t_3$; it is
shown in Fig. 3 and in Fig. 2 along with the $P_e^{so}(t_3)$ curves.
Obviously $P_e^{o}(t_3)$ is the greatest lower bound to all possible
$P_e^{so}(t_3)$ at each $t_3$, and consequently the curve $P_e^{o}(t_3)$
always stays on or below all $P_e^{so}(t_3)$ curves. On the other hand,
$P_e^{o}(t_3)$ can also be obtained analytically, at each $t_3$, by
evaluating (3.7) with $q_r$ from (3.8) for every $t_3$; hence the sets
$N_1$ and $N_2$ are different for every different $t_3$. The $t_3$ so
obtained by minimizing $P_e^{o}(t_3)$ will yield a truly optimal
quantizer detector. However, as the number of possible elements
$\mathbf{n}$ in $N_1$ and $N_2$ gets large for large sample size $n$,
the necessary search for $\mathbf{n} = (n_1, n_2, n_3, n_4)$ in $N_1$
and $N_2$ for every $t_3$ will become time-consuming.
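
The level ratio of (3.8) is a direct function of $t_3$, $s$ and the noise distribution, so it can be tabulated cheaply before any enumeration of $N_1$ and $N_2$. The sketch below is illustrative: the two CDFs correspond to the Gaussian and Cauchy cases mentioned in the text (unit scale is assumed here), and `optimal_level_ratio` is a hypothetical name.

```python
import math

def gaussian_cdf(x):
    """Unit-variance Gaussian CDF (unit scale is an assumption here)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cauchy_cdf(x):
    """Standard Cauchy CDF (unit scale is an assumption here)."""
    return 0.5 + math.atan(x) / math.pi

def optimal_level_ratio(t3, s, cdf):
    """q_r = q_4 / q_3 as in (3.8), for symmetric noise with CDF `cdf`."""
    q4 = math.log((1.0 - cdf(t3 - s)) / (1.0 - cdf(t3 + s)))
    q3 = math.log((cdf(t3 - s) - cdf(-s)) / (cdf(t3 + s) - cdf(s)))
    return q4 / q3
```

Evaluating `optimal_level_ratio` on a grid of $t_3$ values and feeding each result into the error-probability calculation of (3.7) is one way to trace out the $P_e^{o}(t_3)$ curve described above.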



3.3 Some Characteristics of the $P_e^{so}(t_3)$ and $P_e^{o}(t_3)$ Curves

Having considered the formation of $P_e^{so}(t_3)$ and $P_e^{o}(t_3)$,
we now turn to their characteristics as observed from the curves in
Fig. 2. First we notice that for some $t_3$ there is a range of $q_r$
values that give rise to the same probability of error; that is, for
some $t_3$ the probability of error is insensitive to a certain range of
$q_r$ values. This can be better illustrated by plotting the probability
of error versus the level ratio $q_r$ given $t_3$. As shown in
Fig. 4-Fig. 7, each of these curves is actually a series of steps, and
the width of each step corresponds to the range of $q_r$ which gives
equivalent probability of error at that $t_3$. From these figures
(Fig. 4-Fig. 7) it is clear that for every $t_3$, the probability of
error depends only on the ranges of $q_r$ (i.e., it is a function of the
ranges of $q_r$ only) and not on the actual $q_r$ values. This is
because, for a given $t_3$, $P_e^{so}(t_3)$ depends on $q_r$ through the
sets $N_1$ and $N_2$, and with the sample size $n$ finite, there may be
a range of values of $q_r$ which gives rise to the same sets $N_1$ and
$N_2$. Although the value $[(n_4 - n_1) q_r + (n_3 - n_2)]$ itself
changes for every different $q_r$, the two sets of
$\mathbf{n} = (n_1, n_2, n_3, n_4)$ that give $N_1$ and $N_2$, such that
$[(n_4 - n_1) q_r + (n_3 - n_2)]$ is $\ge 0$ or $< 0$, respectively, may
be invariant under different $q_r$. We expect the sets $N_1$ and $N_2$
to be more distinguishable for different $q_r$, and the staircase-like
curves in Fig. 4-Fig. 7 to smooth out, as $n$ gets large.

Next we notice that for large enough $q_r$ and fixed $n$, the
probability of error is independent of $q_r$ for every $t_3$. This can
be seen from Fig. 2, or better from Fig. 4-Fig. 7, where the last step
extends all the way from $q_r = 10$ for any $t_3$; this is due to the
fact that the sample size is finite, causing $N_1$ and $N_2$ to be
invariant under large $q_r$. Again, it will occur at larger $q_r$ when
$n$ gets larger.

Figure 4b. Probability of error vs. $q_r$ for Cauchy noise, SNR = 0.75, $t_3$ = 0.05.
Figure 6a. Probability of error vs. $q_r$ for Gaussian noise.
Figure 7a. Probability of error vs. $q_r$ for Gaussian noise.
Figure 7b. Probability of error vs. $q_r$ for Cauchy noise.

The insensitivity of $P_e^{so}(t_3)$ to $q_r$ becomes significant for
small $t_3$. As shown in the staircase-like curves for small $t_3$, the
probability of error is roughly constant for all $q_r$ values greater
than one. Thus the $P_e^{o}(t_3)$ curve almost coincides with all the
$P_e^{so}(t_3)$ curves for small $t_3$ values (see Fig. 2), and hence
the minimum points of the $P_e^{o}(t_3)$ and $P_e^{so}(t_3)$ curves are
found located close to each other. These characteristics of the curves
have a very important implication for the adaptation discussed later.

Since the probability of error given $t_3$ depends only on the range of
$q_r$ for finite $n$, there must be a range of $q_r$ values that gives
the same minimum probability of error. Let us denote this range as an
optimal range of $q_r$ values; it is quite obvious that the value of
$q_r$ given by (3.8) must fall in the optimal range. For example, from
Fig. 2, the optimal breakpoint $t_3$ is shown to be about 0.4 for Cauchy
noise when $s = 0.75$; the corresponding $q_r$ is within the optimal
range, as seen in Fig. 4. As mentioned earlier, the staircase-like
curves smooth out as $n$ gets large, and eventually ($n \to \infty$) the
optimal range for $q_r$ will collapse to a single point, the value given
by (3.8).

Before leaving this section, two main characteristics of
$P_e^{so}(t_3)$ are worth pointing out again: given the signal strength
$s$, the probability of error is insensitive to $q_r$ for any $t_3$
below a certain value; and the probability of error depends only on
$t_3$ for any $q_r$ above a critical value, which was mentioned earlier.
This critical $q_r$ is related to the sample size $n$. If the sample
size $n$ is finite, for quantizers with 4 levels, it is easily seen from
the definitions of $N_1$ and $N_2$,

$$N_1 = \{(n_1, n_2, n_3, n_4) \ \text{s.t.} \ (n_4 - n_1) q_r + (n_3 - n_2) \ge 0\}$$

$$N_2 = \{(n_1, n_2, n_3, n_4) \ \text{s.t.} \ (n_4 - n_1) q_r + (n_3 - n_2) < 0\}$$

that any $q_r$ greater than or equal to $n$ will definitely give the
same sets $N_1$ and $N_2$; hence $P_e^{so}(t_3)$ is identical for all
$q_r \ge n$. Note that Figs. 2-7 are created with sample size $n = 10$.
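
The claim that the decision sets stop changing once $q_r \ge n$ can be checked by brute force for a small sample size. The sketch below enumerates all count vectors for a given $n$ and collects those in $N_1$; `decision_set` is a hypothetical helper name introduced for this illustration.

```python
def decision_set(q_r, n):
    """All (n1, n2, n3, n4) with n1+n2+n3+n4 = n that fall in N1."""
    members = set()
    for n1 in range(n + 1):
        for n2 in range(n + 1 - n1):
            for n3 in range(n + 1 - n1 - n2):
                n4 = n - n1 - n2 - n3
                if (n4 - n1) * q_r + (n3 - n2) >= 0:
                    members.add((n1, n2, n3, n4))
    return frozenset(members)

# With n = 10, any level ratio at or above n yields the same N1
# (and therefore the same N2, its complement).
same_for_large_ratio = decision_set(10, 10) == decision_set(1000, 10)

# Smaller ratios can still change the sets: (0, 6, 2, 2) moves from
# N2 to N1 when q_r goes from 1 to 2.
differs_for_small_ratio = decision_set(1, 10) != decision_set(2, 10)
```

The intuition is that when $n_4 \ne n_1$ and $q_r \ge n$, the term $(n_4 - n_1) q_r$ has magnitude at least $n$, while $|n_3 - n_2|$ is at most $n - 1$, so the sign of the statistic is fixed by $n_4 - n_1$ alone.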

3.4  Some Considerations on the Design of the Adaptive Quantizer Detector

From the previous discussion, we point out that the minimum points of
the P_e^o(t_3) and P_e^{so}(t_3) curves are situated close to each other;
hence if we are willing to suffer a little more probability of error near
the minima, we may just as well consider P_e^{so}(t_3) instead of
P_e^o(t_3), since P_e^o(t_3) will (as noted above) take much more
processing time than P_e^{so}(t_3).

Now it becomes necessary to decide which P_e^{so}(t_3) curve
(correspondingly, which q_r) to work on. However, there is a rule of thumb
for picking q_r, condensed from the previous descriptions of the general
characteristics of the curves, which gives a guaranteed performance for
the adaptive quantizer detector.


If q_r is chosen such that it is strictly less than n and greater
than 1, we are guaranteed that the performance of the adaptive quantizer
detector is better than the worst possible performance with that
particular noise. This is because, for small t_3, P_e^{so}(t_3) is roughly
the same for almost every q_r, and for large t_3, P_e^{so}(t_3) is
increasing with q_r but upper bounded by the value of P_e^{so}(t_3) with
q_r ≥ n. So, using any P_e^{so} with 1 < q_r < n will always give
performance better than the lower bound performance.

It appears from the curves that the smaller the q_r used, the better the
adaptive quantizer detector's performance will be; however, we note from
Fig. 2 that q_r = 2 gives the largest minimum probability of error over all
q_r for the Gaussian noise case, though the performance during the adaptive
process is almost the best we can get. Since the noise is unknown to
the detector, we really do not have any good guess for the initial t_3 to
start our iterative process for adaptation. If we start with a "small"
initial t_3, then it does not really matter which q_r we use, since the
performance of the adaptive process is insensitive to q_r in the
range of small t_3. But if our initial choice of t_3 turns out to be
"large", we have a tradeoff between better performance with smaller q_r
and faster convergence to the final optimal operating point with
larger q_r, which is due to its relatively steeper slope.

One might arrive at the conclusion that, if we can start with
arbitrarily small t_3, we can then forget about choosing q_r and still
have both fast convergence and an almost uniform performance over all q_r.

But we simply cannot start with arbitrarily small t_3 because, as will be
seen in the simulation, small t_3 gives a very bad estimate of the
probability of error, especially when n is small; besides, the
smallest initial t_3 that can be used is also dictated by the particular
iterative scheme being used.

Finally we observe from Fig. 8 that the slope of the P_e^{so}(t_3) curve
may be steeper for smaller s. Hence, the absolute amount of errors
saved by adapting the quantizer detector to its optimal operating
point may be larger for smaller s; however, the percentage improvement
in the probability of detection is less than with larger s.
Thus it depends on the particular design objective whether
or not adapting to the optimal quantization parameters is
worth doing for large or small signal strength.

So far only Gaussian and Cauchy distributions have been considered; but
since they represent two extremes, we may consider all these trends to
be typical.


Figure 8b.  P_e^{so}(t_3) curves for Cauchy noise with q

4.  THE ADAPTIVE DETECTION SYSTEM

4.1  Two  Methods  of  Adaptation 

There are several possible ways to adapt the quantizer to the
unknown noise. For example, we may use a training sequence of signals
which is known to the detector before transmission; the method is to
adjust the parameters in the quantizer (i.e., breakpoints and levels)
until a maximum number of samples from the sequence are correctly "detected."
This method requires a certain idling period for training before any
actual transmission and detection of real data, which may not be acceptable
in some cases. Furthermore, if the background noise is time-varying,
even if it changes very slowly, the training process may have to be
repeated quite often.

Another way is to use unsupervised decision-directed adaptation,
in which the detector runs with real data while the
adaptation of the quantizer is taking place. This way the detector can
operate on a full-time basis and can keep up with any change in the noise,
up to a certain time lag due to the transient response of the particular
adaptation scheme being used in the system. In this method, every
decision made on the real data is assumed correct and is used as a
training sequence for the optimal quantizer parameter values.

The potential danger of this method is the possibility of system
runaway: if enough of the decisions made are actually incorrect, the
modifications of the quantizer values based on these incorrect decisions
drive the quantizer away from its optimal state, which results in still
more errors in decisions. This is most likely to happen when the initial
probability of error of the detector is large.


The structure of the whole detection system in our simulation is
shown in Fig. 9. The detection scheme follows the structure in (3.5),
where n_i is the number of samples, from an observation of 10
samples, that fall in the i-th interval, which is characterized by the
value of the breakpoint t_3 (note: i runs from 1 to 4 for a 4-level
quantizer detector). Then, with level ratio q_r, the quantity
(n_4 - n_1) q_r + (n_3 - n_2) is compared to a threshold (which is zero in
our case) to decide which hypothesis (H_0 or H_1) those 10 samples are
from, depending on whether the quantity is below or above the threshold.
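The decision rule just described can be sketched as follows. The interval layout (-∞, -t_3), [-t_3, 0), [0, t_3), [t_3, ∞) and the function names are assumptions made for illustration. The report only states that the intervals are characterized by t_3, and the treatment of a statistic exactly equal to the threshold follows the definition of N_1.

```python
from bisect import bisect_right


def interval_counts(samples, t3):
    """Count samples falling in the four quantizer intervals.

    Assumed layout: (-inf, -t3), [-t3, 0), [0, t3), [t3, inf).
    """
    edges = [-t3, 0.0, t3]
    counts = [0, 0, 0, 0]
    for y in samples:
        counts[bisect_right(edges, y)] += 1
    return counts


def decide(samples, t3, q_r, threshold=0.0):
    """Return 1 (decide H_1) if the statistic is at or above the threshold."""
    n1, n2, n3, n4 = interval_counts(samples, t3)
    statistic = (n4 - n1) * q_r + (n3 - n2)
    return 1 if statistic >= threshold else 0


block = [-1.2, -0.3, 0.1, 0.2, 0.5, 0.9, 1.4, -0.1, 0.3, 2.0]
print(interval_counts(block, t3=0.4), decide(block, t3=0.4, q_r=2.0))
```

Negating every sample in the block mirrors the counts and flips the decision, which reflects the symmetry assumed for the noise.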

As mentioned in a previous chapter, additional complexity goes into
the system when P_e^o(t_3) is used instead of P_e^{so}(t_3). P_e^{so}(t_3)
is the probability of error as a function of t_3 for a fixed q_r, and hence
the sets N_1 and N_2 are fixed at all times, while P_e^o(t_3) requires new
sets N_1 and N_2, corresponding to the new q_r, for every newly iterated
t_3. Unless it is necessary to go to the true optimal point of the detector
by using P_e^o(t_3), we will consider the P_e^{so}(t_3) case only.


In iterative procedures, there are two ways to locate the optimal
t_3 which gives minimum P_e^{so}. One is to find the zero of the derivative
of P_e^{so}(t_3) with respect to t_3. The other is to locate the
minimum of the function P_e^{so}(t_3) directly. In the 4-level
quantizer-detector, P_e^{so}(t_3) and \partial P_e^{so}(t_3) / \partial t_3
are given as

    P_e^{so}(t_3) = \frac{1}{2} \sum_{n \in N_1} \Pr\{n \mid H_0\} + \frac{1}{2} \sum_{n \in N_2} \Pr\{n \mid H_1\}    (4.1)

and
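    \frac{\partial P_e^{so}(t_3)}{\partial t_3} = \frac{1}{2} \sum_{n \in N_1} \Pr\{n \mid H_0\} \left[ \frac{n_3}{p_3^0} - \frac{n_2}{p_2^0} \right] f_N(t_3+s) + \frac{1}{2} \sum_{n \in N_2} \Pr\{n \mid H_1\} \left[ \frac{n_3}{p_3^1} - \frac{n_2}{p_2^1} \right] f_N(t_3-s)    (4.2)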

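where

    \Pr\{n \mid H_j\} = \frac{n!}{n_1!\, n_2!\, n_3!\, n_4!} (p_1^j)^{n_1} (p_2^j)^{n_2} (p_3^j)^{n_3} (p_4^j)^{n_4}, \quad j = 0, 1

    N_1 = \{ n = (n_1,n_2,n_3,n_4) \ \text{s.t.} \ (n_4-n_1) q_r + (n_3-n_2) \ge 0 \}

    N_2 = \{ n = (n_1,n_2,n_3,n_4) \ \text{s.t.} \ (n_4-n_1) q_r + (n_3-n_2) < 0 \}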
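For a known noise distribution, the error probability in (4.1) can be evaluated exactly by summing multinomial probabilities over the two decision sets. The sketch below does this for unit-variance Gaussian noise. The interval layout (breakpoints at -t_3, 0, t_3), the convention H_0: Y = N - s, H_1: Y = N + s, the amplitude s = 0.75 and all function names are assumptions for illustration only.

```python
from itertools import product
from math import erf, factorial, sqrt


def gaussian_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def interval_probs(t3, shift, cdf=gaussian_cdf):
    """P(Y in interval i), i = 1..4, for Y = N + shift, breakpoints -t3, 0, t3."""
    edges = [-t3, 0.0, t3]
    c = [0.0] + [cdf(e - shift) for e in edges] + [1.0]
    return [c[i + 1] - c[i] for i in range(4)]


def multinomial_pmf(counts, probs):
    coeff = factorial(sum(counts))
    value = 1.0
    for k, p in zip(counts, probs):
        coeff //= factorial(k)   # stays an integer at every step
        value *= p ** k
    return coeff * value


def error_probability(t3, s, q_r, n=10, cdf=gaussian_cdf):
    """Error probability for a fixed level ratio q_r, equal priors."""
    p0 = interval_probs(t3, -s, cdf)   # assumed H_0: Y = N - s
    p1 = interval_probs(t3, +s, cdf)   # assumed H_1: Y = N + s
    pe = 0.0
    for n1, n2, n3 in product(range(n + 1), repeat=3):
        n4 = n - n1 - n2 - n3
        if n4 < 0:
            continue
        counts = (n1, n2, n3, n4)
        if (n4 - n1) * q_r + (n3 - n2) >= 0:   # n in N_1: H_1 is decided
            pe += 0.5 * multinomial_pmf(counts, p0)
        else:                                   # n in N_2: H_0 is decided
            pe += 0.5 * multinomial_pmf(counts, p1)
    return pe


print(error_probability(t3=0.4, s=0.75, q_r=2.0))
```

With s = 0 the two hypotheses coincide and the function returns 1/2, which is a convenient sanity check on the enumeration.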


We can see from these equations that finding the zero of
\partial P_e^{so}(t_3) / \partial t_3 involves an additional estimation of
the noise density function f_N(·); it was found that it may not be well
approximated by any simple means. Besides, the computation required to
obtain \partial P_e^{so}(t_3) / \partial t_3 is more involved and
time consuming. So, in working with P_e^{so}(t_3) directly and using some
"peak seeking" method to locate its minimum point, all we need is
a good approximation of the p_i^j, i = 1,2,3,4; j = 0,1, which are
easily estimated.


The adaptation scheme used in our simulation, as shown in Fig. 9, is
decision directed. Every decision made from a block of 10 samples is
assumed correct and, based on this decision (H_0 or H_1), the 10 samples
are then modified so that their noise and signal content are revealed.
Explicitly, the modification is as follows:

    N_i = Y_i + s   if decision is H_0
    N_i = Y_i - s   if decision is H_1,        i = 1, 2, 3, ..., n = 10


Hence, if all the decisions ever made were correct, all the noise data
so obtained will be distributed according to the true noise present in the
environment. With these noise data, we can approximate the p_i^0 and p_i^1,
i = 1,2,3,4, necessary for the computation of P_e^{so}, which is going to be
minimized with respect to t_3. The approximation is done in the usual
way; that is,

    p_i^0 = \frac{\text{number of noise data in memory s.t. the value (noise data} - s\text{) is in the i-th interval}}{\text{total number of noise data stored in memory}}    (4.4)

    p_i^1 = \frac{\text{number of noise data in memory s.t. the value (noise data} + s\text{) is in the i-th interval}}{\text{total number of noise data stored in memory}}    (4.5)


Notice that the location of the i-th interval is determined by the
current iterated breakpoint t_3, so p_i^0, p_i^1 and hence P_e^{so} change
accordingly with t_3 in each iteration. The only way we can update the
values of p_i^0 and p_i^1 is to check through the entire storage of the
noise data and perform the above approximation for p_i^0 and p_i^1 in each
iteration. In fact, this is the most troublesome step in the whole
algorithm in the simulation.
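The noise recovery step and the relative-frequency estimates of (4.4) and (4.5) can be sketched as below. The interval layout (breakpoints at -t_3, 0, t_3) and the function names are assumptions for illustration. The full pass over the stored noise data on every call mirrors the cost noted above.

```python
from bisect import bisect_right


def recover_noise(block, decision, s):
    """Decision-directed recovery: N = Y + s under H_0, N = Y - s under H_1."""
    return [y + s if decision == 0 else y - s for y in block]


def estimate_interval_probs(noise_memory, t3, s):
    """Relative-frequency estimates of p_i^0 and p_i^1, i = 1..4."""
    edges = [-t3, 0.0, t3]          # assumed interval layout
    c0, c1 = [0] * 4, [0] * 4
    for x in noise_memory:
        c0[bisect_right(edges, x - s)] += 1   # sample as seen under H_0
        c1[bisect_right(edges, x + s)] += 1   # sample as seen under H_1
    total = len(noise_memory)
    return [c / total for c in c0], [c / total for c in c1]


noise = [-1.0, -0.2, 0.1, 0.3, 1.5, -0.6, 0.6, 0.05]
p0_hat, p1_hat = estimate_interval_probs(noise, t3=0.4, s=0.75)
print(p0_hat, p1_hat)
```

Because the interval edges move with t_3, both estimates have to be recomputed from the stored samples at every iteration.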


Once P_e^{so}(t_3^{(ℓ)}) is found in the ℓ-th iteration using t_3^{(ℓ)},
the iterated value of the breakpoint in the ℓ-th iteration, the (ℓ+1)-st
breakpoint value can be obtained by the following iterative process:

    t_3^{(\ell+1)} = t_3^{(\ell)} - G_\ell \, \frac{P_e^{so}(t_3^{(\ell)} + C_\ell) - P_e^{so}(t_3^{(\ell)} - C_\ell)}{2 C_\ell}    (4.6)


This is the well-known Kiefer-Wolfowitz method in stochastic
approximation. With this, t_3^{(ℓ)} approaches, as ℓ (the number of
iterations) goes to infinity, the limit t_3 which gives the minimum value
of the function P_e^{so}(t_3). However, it is necessary that the two
sequences G_ℓ (the stepping sequence) and C_ℓ satisfy the following
conditions for convergence,
In our simulation, G_ℓ and C_ℓ are chosen as 1/ℓ and 1/(4ℓ^k),
respectively, and it can be shown that this choice of G_ℓ and C_ℓ does
satisfy (1)-(4). With this, Equation (4.6) becomes


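    t_3^{(\ell+1)} = t_3^{(\ell)} - \frac{1}{\ell} \cdot \frac{P_e^{so}(t_3^{(\ell)} + C_\ell) - P_e^{so}(t_3^{(\ell)} - C_\ell)}{2 C_\ell}    (4.7)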
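The Kiefer-Wolfowitz update can be sketched in a few lines. This is a minimal illustration, not the simulation code of the report. The objective is a hypothetical shallow quadratic standing in for the error-probability curve, and the exponent 1/3 in the difference-width sequence is an assumption, since the exponent used in the report is not legible here.

```python
def kiefer_wolfowitz(f, t0, iterations,
                     gain=lambda l: 1.0 / l,
                     width=lambda l: 0.25 * l ** (-1.0 / 3.0)):
    """Iterate t <- t - G_l * (f(t + C_l) - f(t - C_l)) / (2 * C_l)."""
    t = t0
    for l in range(1, iterations + 1):
        g, c = gain(l), width(l)
        t -= g * (f(t + c) - f(t - c)) / (2.0 * c)
    return t


def toy_curve(t):
    """Hypothetical stand-in for the error curve: flat bowl, minimum at 0.4."""
    return 0.1 + 0.05 * (t - 0.4) ** 2


t_final = kiefer_wolfowitz(toy_curve, t0=2.0, iterations=2500)
print(t_final)
```

On this flat toy curve the iterate is still near 1.08 after 2500 steps, far from the minimum at 0.4. A harmonic gain applied to a shallow slope moves the breakpoint very slowly.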
As we have mentioned in a previous chapter, the smallest initial t_3
that can be used is governed by the largest element of the C_ℓ sequence,
which is the first element (C_1 = 1/4) in the case where
C_ℓ = (1/4) ℓ^{-k} (note: this sequence is monotone decreasing). The
reason is simply that if the initial t_3 were smaller than C_1,
t_3 - C_1 would be negative and P_e^{so} would be undefined. For a
different choice of the C_ℓ sequence, and hence a different type of
convergence behavior, the initial t_3 can be made as small as desired.


Computer simulations of the system in Fig. 9 are done with Gaussian and
Cauchy noises, whose density functions are
f_N(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} and
f_N(x) = \frac{1}{\pi (1 + x^2)}, respectively. The iterative scheme of
(4.7) is used to iterate the optimal t_3 with signal-to-noise ratio
S/N = 0.75 and various initial breakpoint values.

Figures 10 and 11 show how the iterative process has brought t_3 toward
its optimal values (in cases where the initial t_3 = 0.25 and 2.0, with q_r
fixed at 2.0, for Gaussian noise). However, the algorithm is far from
converging even after 2500 iterations. Also, we see from Figs. 12 and 13
that, after a large number of iterations, the probability of error of the
system approximates the theoretical values for Gaussian noise given in
Fig. 2. Similar curves for Cauchy noise are given in Figs. 14-17. Notice
that the curves in Figs. 12, 13, 16, and 17 are generated according to the
following definition:


    \text{Probability of error at } \ell\text{-th stage} = \frac{\text{total incorrect decisions made by system up to the } \ell\text{-th stage}}{\text{total decisions made by system up to the } \ell\text{-th stage}}
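The running error rate defined above is a cumulative ratio and can be computed in one pass. The list of error flags and the function name are hypothetical inputs for illustration.

```python
from itertools import accumulate


def running_error_rate(errors):
    """errors[k] is 1 if the (k+1)-th decision was wrong, else 0."""
    return [wrong / (k + 1) for k, wrong in enumerate(accumulate(errors))]


flags = [1, 0, 0, 1, 0, 0, 0, 0]
rates = running_error_rate(flags)
print(rates)
```

Because every past decision carries equal weight, an early burst of errors keeps this figure high long after the breakpoint has improved.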


Iterated t_3 vs number of iterations for Gaussian noise.

Figure 12. Iterated probability of error vs number of iterations for
Gaussian noise; q_r = 2.0, n = 10 and initial t_3 = 0.25.

Figure 13. Iterated probability of error vs number of iterations for
Gaussian noise; q_r = 2.0, n = 10 and initial t_3 = 2.0.

Note: optimum t_3 value is ≈ 0.4.

Iterated t_3 vs number of iterations for Cauchy noise.

Figure 16. Iterated probability of error vs number of iterations for
Cauchy noise; q_r = 2.0, n = 10 and initial t_3 = 0.25.

Figure 17. Iterated probability of error vs number of iterations for
Cauchy noise; q_r = 2.0, n = 10 and initial t_3 = 2.0.
This way of generating the probability of error of the detection
system at each stage can only show that the system's probability of error
does approach its minimal value, but in no way indicates the system's
"current potential" of making errors. This "current potential" of making
errors by the system is actually the P_e^{so} evaluated at the current
iterated breakpoint t_3^{(ℓ)}.

4.3  The  Modified  Iterative  Procedure 

In order to speed up the convergence by a considerable amount, the
above iterative scheme (4.6) is modified in the following way. If the
sign of the quantity (P_e^{so}(t_3^{(ℓ)} + C_ℓ) - P_e^{so}(t_3^{(ℓ)} - C_ℓ))
is different from that of the previous quantity, G_ℓ will take on the next
value (following the one used by G_{ℓ-1}) in the stepping sequence;
otherwise, the same value used by G_{ℓ-1} is used.

It is necessary that C_ℓ be constant valued and the stepping sequence
G_ℓ be monotone decreasing (in addition to \lim_{\ell \to \infty} G_\ell = 0
and \sum_{\ell=1}^{\infty} G_\ell = \infty) for the modified iterative
procedure to be convergent. In our simulation with the modified scheme,
C_ℓ = 0.125 and the stepping sequence is again 1/ℓ (the harmonic sequence
is monotone decreasing). Now, the smallest initial t_3 that can be used is
0.125 in this modified version of the adaptive system. We use Table 1 to
help illustrate this modification.

Table 1.  Illustration for modified iterative procedure.
Intuitively, this scheme gives faster convergence because we apply a
large modification to t_3 when the direction of the search for the optimum
does not change from that of the previous search, and reduce the size of
the modification only when the search direction changes, which indicates an
overshoot of the iterated t_3 about its optimum and hence the need for a
finer search.
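The modified rule can be sketched as follows: the difference width stays constant, and the gain moves to the next element of the harmonic sequence only when the sign of the difference flips. The quadratic objective is a hypothetical stand-in for the error-probability curve, and the names are illustrative only.

```python
def modified_search(f, t0, iterations, c=0.125):
    """Constant width c; the gain 1/k advances only on a sign change."""
    t, k, prev_sign = t0, 1, 0
    for _ in range(iterations):
        diff = f(t + c) - f(t - c)
        sign = (diff > 0) - (diff < 0)
        if prev_sign != 0 and sign != prev_sign:
            k += 1                  # overshoot detected: take the next gain
        t -= (1.0 / k) * diff / (2.0 * c)
        prev_sign = sign
    return t, k


def toy_curve(t):
    """Hypothetical stand-in for the error curve, minimum at 0.4."""
    return 0.1 + 0.6 * (t - 0.4) ** 2


t_final, gain_index = modified_search(toy_curve, t0=1.7, iterations=30)
print(t_final, gain_index)
```

On this toy curve the first step overshoots the minimum, the gain drops once from 1 to 1/2, and the iterate then closes in geometrically without further sign changes.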

Figures 18-23 show that t_3 converges to its optimal value at a much
faster rate than in Figs. 10, 11, 14, and 15 for the
same S/N (= 0.75), with q_r = 2.0 and initial t_3 = 0.13, 1.7 and 2.2, for
Gaussian and Cauchy noises.


Iterated t_3 (with modified stepping sequence) for Cauchy noise.

Note: optimum t_3 value is ≈ 0.95.

Iterated t_3 (with modified stepping sequence) for Gaussian noise.

5.  CONCLUSION 

The simulation in the previous chapter shows that the adaptive
detection system does not run away but eventually operates in its optimal
state, under the conditions that the signal-to-noise ratio S/N = 0.75
and sample size n = 10, with Gaussian and Cauchy noises. It is expected
that the system will work just as well at smaller signal-to-noise ratios;
trying other signal-to-noise ratio levels is left to those who are
interested.

Some comments on the size of the memory required to store the noise data
are necessary. In the simulation, we store all the noise data available,
which amounts to 25000 storage locations (sample size times the number of
decisions made) in the final stage. However, the actual amount of noise
data needing to be stored can be determined from the simulated curves in
the previous chapter. The general guidelines in deciding the storage
size are the size of the memory available in the system, the time allowed
for processing the data during each iteration, and the accuracy of the
estimates of the p_i^0 and p_i^1 necessary to achieve the desired detector
performance.
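One way to bound the storage, sketched here as an assumption rather than as the scheme used in the simulation (which stored all noise data), is a fixed-capacity buffer that discards the oldest samples first. The class name and capacity are hypothetical.

```python
from collections import deque


class NoiseMemory:
    """Fixed-capacity store for recovered noise samples (oldest dropped first)."""

    def __init__(self, capacity):
        self._buffer = deque(maxlen=capacity)

    def add_block(self, noise_block):
        self._buffer.extend(noise_block)

    def samples(self):
        return list(self._buffer)


memory = NoiseMemory(capacity=25)
for block_index in range(4):          # four blocks of 10 samples each
    memory.add_block([block_index + 0.01 * i for i in range(10)])
print(len(memory.samples()))
```

The capacity then trades off the three guidelines above: a smaller buffer shortens each pass over the data but makes the interval-probability estimates coarser.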

Finally, we note that other simulations were conducted in which the
levels of the quantizer were adapted. The results of this analysis indicate
that, although the levels do adapt to the noise, the performance gained in
doing this is negligible when the initial t_3 is "small"; that is, the
performance of the adaptive quantizer-detector using P_e^o(t_3) in
the iteration process is the same as that using any P_e^{so}(t_3) curve
with 1 < q_r < n. This coincides exactly with the observations discussed in
Sections 3.3 and 3.4.
APPENDIX


DERIVATION OF THE UPPER BOUND TO P_e IN (2.33)


From (2.32), the upper bound to P_e is

"+" when i = 0, the bound in (A1) can be written as

(A2) approaches 1^∞ as n → ∞, which is an indeterminate form. To
apply L'Hospital's Rule we first rewrite (A2) as


exp 


[-  - 
log  ]  22 

Lk»l 


5*1 


-1 


Now the fraction inside the outermost bracket in (A3) is of an
indeterminate form as n → ∞.


A-1B 


Apply  the  L'Hospital  Rule  on  this  fraction  yields  — ,  whi 


we obtain the upper bound to P_e in equation (2.33).